Offline recognition of omnifont Arabic text using the HMM ToolKit (HTK)
Author
Abstract
This paper presents a recognition system for cursive Arabic text. The system decomposes the document image into text line images and extracts a set of simple statistical features from a narrow window that slides along each text line. It then feeds the resulting feature vectors into the Hidden Markov Model Toolkit (HTK), a portable toolkit for building speech recognition systems. The proposed system is applied to a data corpus that includes Arabic text from more than 600 A4-size sheets typewritten in multiple computer-generated fonts. © 2007 Elsevier B.V. All rights reserved.
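The sliding-window step described in the abstract can be illustrated with a minimal sketch. The abstract does not list the actual features, window width, or scan direction, so everything below is an assumption made for illustration: the function name `window_features`, the three features (ink density, normalised vertical centre of ink, average vertical transitions per column), the default window width of 3 columns, and the scan over increasing column indices are all hypothetical. The input is assumed to be an already binarised text-line image.

```python
from typing import List

# A text-line image as rows of pixels: 0 = background, 1 = ink (assumed binarised).
Image = List[List[int]]


def window_features(line: Image, width: int = 3, step: int = 1) -> List[List[float]]:
    """Slide a narrow window along a text-line image.

    Returns one feature vector per window position. The three features are
    illustrative stand-ins, not the features used in the paper.
    """
    height = len(line)
    n_cols = len(line[0]) if height else 0
    vectors: List[List[float]] = []
    for start in range(0, n_cols - width + 1, step):
        cols = range(start, start + width)
        ink = [(r, c) for r in range(height) for c in cols if line[r][c]]
        # Feature 1: fraction of ink pixels inside the window.
        density = len(ink) / (height * width)
        # Feature 2: mean row index of ink pixels, scaled to [0, 1];
        # 0.5 is used as a neutral value when the window holds no ink.
        if ink:
            centre = sum(r for r, _ in ink) / len(ink) / max(height - 1, 1)
        else:
            centre = 0.5
        # Feature 3: ink/background changes between vertically adjacent
        # pixels, averaged over the columns of the window.
        transitions = sum(
            line[r][c] != line[r + 1][c]
            for c in cols
            for r in range(height - 1)
        ) / width
        vectors.append([density, centre, transitions])
    return vectors


if __name__ == "__main__":
    toy_line = [
        [0, 1, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    for vec in window_features(toy_line, width=2):
        print(vec)
```

Each text line becomes a sequence of short feature vectors, which is the same shape of input that a speech recogniser receives from successive audio frames. Writing these vectors out in the parameter file format that HTK reads is a separate step and is not shown here.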
Similar resources
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
In order to facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Off-line Arabic Handwritten Isolated Character Recognition using Hidden Markov Models
This paper presents a recognition system for Arabic handwritten isolated characters. The recognition system is based on a hidden Markov model (HMM). The entire system is capable of recognizing the Arabic handwritten characters. First, the system removes all the variation in the character images. Second, features are extracted using the sliding window technique with HMM. Then, the HMM is used for ...
Arabic phonemes transcription using data driven approach
The efficiency and correctness of continuous Arabic Speech Recognition Systems (ARS) hinge on the accuracy of the language phoneme set. The main goal of this research is to recognize and transcribe Arabic phonemes using a data-driven approach. We used the Hidden Markov Toolkit (HTK) to develop a phoneme recognizer, carrying out several experiments with different parameters, such as varying numb...
Online Handwriting Recognition System for Assamese Language Based on HMM and SVM Modelling
This work focuses on the development of an Assamese online character recognition system using HMM and SVM and analyses the recognition performance of both models. Recognition models using HTK (HMM Toolkit) and LIBSVM (SVM Toolkit) are generated by training on 181 different Assamese strokes. Stroke and Akshara level testing are performed separately. In stroke level testing, the confusion pat...
Isolated English Language Digit Recognition Using Hidden Markov Model Toolkit
The main purpose of the study was to develop a speech recognition system for isolated digits of English language using HTK. Speech, in addition to being a tool of communication, is also a symbol of identity and authorization. Two different corpora were collected of audio recordings of isolated digits of English language speakers, in which speakers read numeric digits. Both of the collected corp...
Journal title:
- Pattern Recognition Letters
Volume 28, Issue
Pages -
Publication date 2007